AutoPart: Parameter-Free Graph Partitioning and Outlier Detection

نویسنده

Deepayan Chakrabarti

چکیده

Graphs arise in numerous applications, such as the analysis of the Web, router networks, social networks, co-citation graphs, etc. Virtually all the popular methods for analyzing such graphs, for example, k-means clustering, METIS graph partitioning and SVD/PCA, require the user to specify various parameters such as the number of clusters, number of partitions and number of principal components. We propose a novel way to group nodes, using information-theoretic principles to choose both the number of such groups and the mapping from nodes to groups. Our algorithm is completely parameter-free, and also scales practically linearly with the problem size. Further, we propose novel algorithms which use this node group structure to get further insights into the data, by finding outliers and computing distances between groups. Finally, we present experiments on multiple synthetic and real-life datasets, where our methods give excellent, intuitive results.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data, often, modeled using vector autoregressive moving average (VARMA) model. But presence of outliers can violates the stationary assumption and may lead to wrong modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

متن کامل

Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means

One of the most important concerns of a data miner is always to have accurate and error-free data. Data that does not contain human errors and whose records are full and contain correct data. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...

متن کامل

Outlier Detection by Boosting Regression Trees

A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...

متن کامل

Parameter-free Outlier-aware Clustering on Attributed Graphs

متن کامل

Density-Based Clustering and Anomaly Detection

As of 1996, when a special issue on density-based clustering was published (DBSCAN) (Ester et al., 1996), existing clustering techniques focused on two categories: partitioning methods, and hierarchical methods. Partitioning clustering attempts to break a data set into K clusters such that the partition optimizes a given criterion. Besides difficulty in choosing the proper parameter K, and inca...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2004

AutoPart: Parameter-Free Graph Partitioning and Outlier Detection

نویسنده

چکیده

منابع مشابه

Identification of outliers types in multivariate time series using genetic algorithm

Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means

Outlier Detection by Boosting Regression Trees

Parameter-free Outlier-aware Clustering on Attributed Graphs

Density-Based Clustering and Anomaly Detection

عنوان ژورنال:

اشتراک گذاری